Arabic Roots Extraction Using Morphological Analysis
Abstract
The Arabic language is characterized by a rich and complex morphology based on root-pattern schemes. Root extraction is one of the most important topics in natural language processing applications such as information retrieval, text processing, machine translation, and part-of-speech tagging. This paper presents a method for extracting the triliteral roots of Arabic words, that is, roots made up of three consonants, by removing prefixes and suffixes and using a list of morphological weights. Experimental results on a list of eleven different root inflections show the effectiveness of the proposed method, with a success rate of 94%.
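The two steps named in the abstract, affix removal followed by matching against morphological weights, can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the affix lists, the pattern list, and the names `PREFIXES`, `SUFFIXES`, `PATTERNS`, `match_pattern`, and `extract_root` are all assumptions made for the example. Patterns are written with the conventional placeholder letters ف, ع, ل standing for the three root consonants.

```python
# Hypothetical, deliberately tiny affix and pattern lists (NOT the paper's lists).
PREFIXES = ["", "وال", "بال", "ال", "و"]   # "" means "strip nothing"
SUFFIXES = ["", "ات", "ون", "ين", "ة"]
# Morphological weights: ف, ع, ل mark the root slots; other letters are literal.
PATTERNS = ["فاعل", "مفعول", "فعال", "فعيل", "مفعل", "فعل"]
ROOT_SLOTS = "فعل"


def match_pattern(stem, pattern):
    """Return the three root letters if stem fits pattern, else None."""
    if len(stem) != len(pattern):
        return None
    root = {}
    for stem_char, pattern_char in zip(stem, pattern):
        if pattern_char in ROOT_SLOTS:
            root[pattern_char] = stem_char   # slot letter: capture a root consonant
        elif stem_char != pattern_char:
            return None                      # literal letter must match exactly
    return "".join(root[slot] for slot in ROOT_SLOTS)


def extract_root(word):
    """Try each prefix/suffix combination, then each pattern; first match wins."""
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            if len(stem) < 3:
                continue                     # too short to hold a triliteral root
            for pattern in PATTERNS:
                root = match_pattern(stem, pattern)
                if root is not None:
                    return root
    return None


if __name__ == "__main__":
    for w in ["كاتب", "مكتوب", "الكاتبون", "المدرسة"]:
        print(w, extract_root(w))
```

Because the first matching combination wins, the order of the lists acts as a crude disambiguation rule; a fuller system would need larger lists and some way to rank competing analyses.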
Similar resources
Rule-based Approach for Arabic Root Extraction: New Rules to Directly Extract Roots of Arabic Words
Extracting word roots in Arabic is difficult because of the language's specific morphological and structural changes. Several techniques have been proposed to address this problem. This paper continues the work begun in [1] on identifying and exploiting relationships among Arabic letters for root extraction. Eight different rules that detect the root letters accordi...
An Improved Arabic WordS roots Extraction method using n-Gram Technique
Arabic is distinguished by its morphological richness, which forces those working in Arabic language processing (e.g., information retrieval, document classification, text summarization) to deal with many words that seem different but in fact derive from the same root. One way to overcome this problem is to reduce words to their roots. Thi...
A Markovian approach for arabic root extraction
In this paper, we present an Arabic morphological analysis system that assigns, for each word of an unvoweled Arabic sentence, a unique root depending on the context. The proposed system is composed of two modules. The first one consists of an analysis out of context. In this module, we segment each word of the sentence into its elementary morphological units in order to identify its possible r...
Enhancing Root Extractors Using Light Stemmers
The rise of Natural Language Processing (NLP) opened new possibilities for various applications that were not feasible before. A morphologically rich language such as Arabic introduces a set of features, such as roots, that would assist the progress of NLP. Many tools were developed to capture the process of root extraction (stemming). Stemmers have improved many NLP tasks without explicit know...
Unsupervised Induction of Arabic Root and Pattern Lexicons using Machine Learning
We describe an approach to building a morphological analyser of Arabic by inducing a lexicon of root and pattern templates from an unannotated corpus. Using maximum entropy modelling, we capture orthographic features from surface words, and cluster the words based on the similarity of their possible roots or patterns. From these clusters, we extract root and pattern lexicons, which allows us to...
Publication date: 2014